Circuit for recursively calculating data



FIELD OF THE INVENTION 

The present invention relates to a circuit for calculating a second data set
based on a first data set calculated by at least one calculation device that is capable of
calculating a data in a predefined number of clock cycles, said calculation device having an
input and an output.

The invention also relates to a system for calculating intracolumn permutation 
elements of an interleaver, a decoding circuit comprising such a system, an electronic device 
and a communication network comprising such a decoding circuit. 

The invention finds an application, for example, in a satellite communication
system or a system implementing the UMTS (Universal Mobile
Telecommunication System) standard, such as a third generation mobile telephone.

BACKGROUND OF THE INVENTION 

Certain data processing systems perform a recursive calculation of data which
necessitates the calculation of a data set based on another data set. For example, a calculation
of data bj[i] may be performed where i and j are indices, i varying from 0 to n and j from 0 to
m, m and n being non-zero integers. This is notably the case in a calculation of a power
matrix.

Fig. 1 represents an example of data to be calculated by such a processing
system. In this example the integer m has the value 9 and the integer n the value 4. Five data
sets are calculated: b0[0] to b9[0], b0[1] to b9[1], b0[2] to b9[2], b0[3] to b9[3], and b0[4] to
b9[4]. The processing system calculates the data b0[0] to b9[0] first, then b0[1] to b9[1],
and so on. Each data set depends on the preceding data set. For example, b0[1] is a function of
b0[0] via a function f:

b0[1] = f(b0[0]).

Similarly, b1[1] = f(b1[0]), b2[1] = f(b2[0]) and so on. In a general way:

bj[i+1] = f(bj[i]).

Fig. 2 illustrates a circuit which makes it possible to perform such a calculation. Such a
circuit comprises a memory 21, a controller 22 and a calculation device 23. The example
hereinafter describes the calculation of a second data set b0[2] to b9[2] based on a first data
set b0[1] to b9[1]. In this example the calculation of a data by the calculation device 23
requires one clock cycle. The data of the first data set b0[1] to b9[1] are stored in the memory
21. During a clock cycle the data b0[1] is sent to the calculation device 23 which then
calculates the data b0[2]. This data is then stored in the memory 21. With the next clock cycle
the data b1[1] is sent to the calculation device 23 which then calculates the data b1[2]. This
data is then stored in the memory 21. The circuit proceeds similarly for the calculation of the
data b2[2] to b9[2].

The controller 22 controls the sending of a data of the first data set to the
calculation device 23 for the calculation of a data of the second data set. In order to do this,
the controller 22 generates the address in the memory 21 at which said data of the first data
set is stored. The memory 21 is a RAM (Random Access Memory). When
the memory 21 receives an address from the controller 22, it sends the data stored at this
address to the calculation device 23.

Such a circuit thus requires a random access memory and a controller. Such a
memory and such a controller cover a considerable silicon surface and draw a considerable
amount of current. This is a drawback, notably in portable electronic devices such as a
mobile telephone. Indeed, in a portable electronic device the available silicon surface is
limited. Moreover, as such a device is fed by a battery, a low current consumption is
important in order to avoid too frequent a recharging of said battery.

SUMMARY OF THE INVENTION 

It is an object of the invention to propose a circuit for calculating a second data
set based on a first data set, said circuit occupying a reduced silicon surface and presenting a
reduced current consumption.

A circuit according to the invention and as defined in the opening paragraph is
characterized in that it comprises transport means for routing a data of the first data set from
the output to the input of the calculation device, in a number of clock cycles depending on the
number of data of the first data set and on the predefined number of cycles necessary for the
calculation of a data, a data advancing through said transport means with each clock cycle.

When a data of the first data set is calculated by a calculation device and is to
be used by this calculation device several clock cycles later for calculating a data of the
second data set, the data of the first data set is routed to the input of the calculation device by
transport means, controlled solely by said clock. The transport means are such that the data of
the first data set reaches the input of the calculation device at the moment when it is to be
used by said calculation device. Thus the circuit needs neither a random access
memory nor a controller, which reduces both the consumption of such a circuit and
the silicon surface covered by such a circuit.

Advantageously, the transport means comprise regulation means for regulating
the number of cycles necessary for routing a data from the output to the input of said
calculation device. Such a circuit then has great flexibility. In fact, the data sets to be
processed by the circuit may have a variable number of data. The number of cycles necessary
for routing a data from the output to the input of the calculation device depends, inter alia, on
the number of data of the data sets. Thanks to the regulation means it is possible to regulate
the number of cycles necessary for routing a data from the output to the input of the
calculation device as a function of the number of data of the data sets to be processed. Thus,
such a circuit may be used for processing data sets which have different numbers of data.

In a preferred embodiment the transport means comprise at least one clock-activated
register, said register being capable of storing a new data with each clock cycle.
According to this embodiment the transport means comprise solely registers capable of
storing one data. Such registers cover little silicon surface and have low current consumption.
Such a circuit is furthermore easy to design, the number of such registers corresponding to
the number of cycles necessary for routing a data from the output to the input of the
calculation device.

BRIEF DESCRIPTION OF THE DRAWINGS

These and other aspects of the invention are apparent from and will be 
elucidated, by way of non-limitative example, with reference to the embodiment(s) described 
hereinafter. 

In the drawings: 

Fig. 1 illustrates an example of data to be calculated;

Fig. 2 is a block diagram illustrating a prior-art circuit for the calculation of
the data of Fig. 1;

Fig. 3 is a block diagram illustrating a circuit according to the invention;

Fig. 4 is a block diagram illustrating a circuit in accordance with an
advantageous embodiment of the invention;

Fig. 5 illustrates a circuit in accordance with the invention for the calculation
of multiplication accumulations;

Fig. 6 illustrates a communication network comprising a circuit in accordance
with the invention;

Fig. 7 illustrates a calculation of an interleaving matrix and of an interleaved
block;

Fig. 8 illustrates a circuit in accordance with the invention for the calculation
of intracolumn permutation elements of an interleaver.

DESCRIPTION OF PREFERRED EMBODIMENTS 

Fig. 3 illustrates an example of a circuit in accordance with the invention.
Such a circuit comprises a calculation device 31 which has an input 311 and an output 312, as
well as transport means 32. In this example the transport means comprise nine registers 321
to 329. The calculation device 31 may further receive additional data 34 such as coefficients.

The example described hereinafter shows how a second data set is calculated
based on a first data set by means of the circuit of Fig. 3. This example is applied to a second
data set b0[2] to b9[2] and to the first data set b0[1] to b9[1] of Fig. 1.

Previously, the data of the first data set are calculated based on initial data
corresponding to the data set b0[0] to b9[0] of Fig. 1. These data are sent in the form of
additional data 34 to the calculation device 31. During a first clock cycle the data b0[0] is sent
to the calculation device 31. The data b0[1] is then calculated by the calculation device 31
and stored in the register 321. It will be noted that the data b0[1] may be stored in parallel in a
storage device (not shown). During a second clock cycle the data b1[0] is sent to the
calculation device 31. The data b1[1] is then calculated by the calculation device 31 and
stored in the register 321 instead of the data b0[1], which is sent to the register 322. Indeed,
the registers 321 to 329 are activated by the clock, that is to say, at each clock cycle the data
present in a register leaves this register.

Similar operations are carried out for the calculation of the data b2[1] to b9[1].
During a tenth clock cycle the data b0[1], stored in the register 329, is sent to the input 311 of
the calculation device 31, whereas the data b9[1] is calculated by the calculation device 31
and sent to the register 321.

During an eleventh clock cycle the data b0[2] of the second data set is
calculated by the calculation device 31, based on the data b0[1]. This data b0[2] is then stored
in the register 321. During this eleventh clock cycle the data b1[1], stored in the register 329,
is sent to the input 311 of the calculation device 31. During a twelfth clock cycle the data
b1[2] is calculated by the calculation device 31 and stored in the register 321. Similar
operations are carried out for the calculation of the data b2[2] to b9[2].

In this example it is supposed that the calculation of a data by the calculation
device 31 requires a single clock cycle. It is possible for such a calculation to require several
clock cycles. For example, let us suppose that such a calculation requires three clock cycles.

During a first clock cycle the data b0[0] is sent to the calculation device 31.
During a second clock cycle the data b1[0] is sent to the calculation device 31. During a third
clock cycle the data b2[0] is sent to the calculation device 31. During this third clock cycle
the data b0[1] is calculated, since the calculation of a data necessitates three clock cycles.
This data is then stored in the register 321. During a tenth clock cycle the data b9[0] is sent to
the calculation device 31. The data b0[1] is then situated in the register 327 and is to be sent
to the calculation device 31 so as to initiate the calculation of the data b0[2] of the second
data set. Consequently, the transport means 32 require only seven registers 321 to 327.

As a result, the number of clock cycles necessary for routing a data from the
output to the input of the calculation device 31 depends both on the number of data of the
data sets and on the number of clock cycles necessary for the calculation of one data. In a
general way, if the data sets comprise k data and if the number of clock cycles required for
the calculation of one data has the value l, the number of clock cycles necessary for the
routing of one data from the output to the input of the calculation device 31 has the value
(k - l). With k = 10 this gives nine cycles for l = 1 and seven cycles for l = 3, as in the two
examples above. In the example of Fig. 3 this means that the transport means require (k - l)
registers activated by the clock.
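
The relation between set size, calculation time and register count can be checked with a small behavioural model. The Python sketch below is illustrative only: the name `run_loop`, the modelling of the calculation device as a FIFO of l stages, and the sample function used in the usage note are assumptions that do not appear in the description. The model chains a calculation device with a latency of l cycles and (k - l) clocked registers, so that the loop has a total delay of k cycles and each result returns to the input exactly when it is needed.

```python
from collections import deque


def run_loop(initial, f, latency, num_sets):
    """Simulate the feedback loop and return the successive data sets.

    initial  : the k initial data (b0[0] .. b(k-1)[0])
    f        : function applied by the calculation device
    latency  : number of clock cycles l needed for one calculation
    num_sets : number of data sets to collect
    """
    k = len(initial)
    if not 1 <= latency <= k:
        raise ValueError("latency must lie between 1 and the set size k")
    empty = object()                               # marks a stage holding no data yet
    device = deque([empty] * latency)              # l stages inside the device
    transport = deque([empty] * (k - latency))     # k - l transport registers
    sets, current, cycle = [], [], 0
    while len(sets) < num_sets:
        out = device.pop()                         # result leaving the device
        if out is not empty:
            current.append(out)
            if len(current) == k:
                sets.append(current)
                current = []
        transport.appendleft(out)                  # every register shifts by one
        fed_back = transport.pop()                 # data reaching the device input
        x = initial[cycle] if cycle < k else fed_back
        device.appendleft(empty if x is empty else f(x))
        cycle += 1
    return sets
```

For instance, `run_loop(list(range(10)), lambda x: (3 * x + 1) % 11, 3, 2)` models ten data, a three-cycle calculation and therefore seven transport registers; the second returned set is f applied twice to each initial data.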

In the preceding examples it was supposed, inter alia, that the calculations are
pipelined, that is to say that with each clock cycle one data is sent to the calculation device
31. It is possible that a data is not sent to the calculation device 31 with each clock cycle,
notably when the circuit in accordance with the invention comprises several calculation
devices. In such a case the number of clock cycles necessary for routing a data from the
output to the input of a calculation device also depends on the number of data of the data sets
and on the number of clock cycles necessary for the calculation of one data, as is discussed in
more detail with respect to Fig. 5.
Fig. 4 illustrates a circuit according to an advantageous embodiment of the
invention. Such a circuit comprises, in addition to the elements mentioned with respect to
Fig. 3, regulation means for regulating the number of cycles necessary for routing a data from
the output to the input of the calculation device 31, in the form of a multiplexer 35. The
multiplexer 35, controlled by a control circuit not shown in Fig. 4, makes it possible to send to the



WO 2004/030225 PCT/IB2003/003943

input 311 of the calculation device 31 the data stored either in the register 323 or in the
register 327 or in the register 329. Thus, it is possible to regulate the number of cycles
necessary for conveying a data from the output to the input of the calculation device 31.
Indeed, if the data stored in the register 323 is selected to be sent to the input of the
calculation device 31, the number of cycles necessary for the routing of a data from the
output to the input of the calculation device 31 has the value 3. If the data stored in the register 327
is selected to be sent to the input of the calculation device 31, the number of cycles necessary
for routing a data from the output to the input of the calculation device 31 has the value 7.

Consequently, such a circuit may be used for processing data sets which have
diverse numbers of data. For example, for processing data sets comprising four data, while
supposing that the calculations are pipelined and that the calculation of one data by the
calculation device 31 requires one clock cycle, the data stored in the register 323 is selected
to be sent to the input 311 of the calculation device 31. For processing data sets comprising
eight data, the data stored in the register 327 is selected. For processing data sets comprising
ten data, the data stored in the register 329 is selected.

Obviously, the regulation means may be designed so as to permit the
selection of a data from each of the registers 321 to 329. Thus it is possible to process data
sets comprising a number of data between 2 and 10 in the case where the calculation of a data
by the calculation device 31 needs one clock cycle.
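
The choice made by the multiplexer can be summarised as a small lookup. The sketch below is a hypothetical helper (the name `tap_for` and its parameters are not part of the description): it returns how many registers a data must traverse, that is, which register output the multiplexer has to select, for a given set size and calculation time, and it rejects set sizes for which no register output is wired to the multiplexer.

```python
def tap_for(set_size, latency=1, available_taps=(3, 7, 9)):
    """Return the number of registers to traverse before feeding back.

    set_size       : number of data k in each data set
    latency        : clock cycles needed for one calculation
    available_taps : register counts selectable by the multiplexer;
                     (3, 7, 9) mirrors registers 323, 327 and 329 of Fig. 4
    """
    needed = set_size - latency
    if needed not in available_taps:
        raise ValueError(
            f"no multiplexer input available for {needed} register(s)"
        )
    return needed
```

With the default taps, `tap_for(4)`, `tap_for(8)` and `tap_for(10)` select three, seven and nine registers respectively; passing `available_taps=range(1, 10)` models a multiplexer connected to every register 321 to 329.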

Fig. 5 represents a circuit in accordance with the invention for the
multiplication-accumulation calculation. Such a circuit comprises four calculation devices 41
to 44. These calculation devices are adders. With each calculation device 41 to 44 is
associated a multiplier, 410 to 440 respectively. With each calculation device are also
associated three registers, 411 to 413, 421 to 423, 431 to 433 and 441 to 443, respectively.

The circuit of Fig. 5 is intended for a calculation of four results of
multiplication-accumulation MAC1 to MAC4, based on sixteen data d1 to d16 and sixteen
coefficients c1 to c16:

MAC1 = c1*d1 + c5*d5 + c9*d9 + c13*d13
MAC2 = c2*d2 + c6*d6 + c10*d10 + c14*d14
MAC3 = c3*d3 + c7*d7 + c11*d11 + c15*d15
MAC4 = c4*d4 + c8*d8 + c12*d12 + c16*d16

Such a circuit is used, for example, in a decoding filter for data transmitted in
the MP3 format. The data are transmitted in the form of data bands, each band being divided
into sub-bands. The circuit of Fig. 5 is controlled by a clock. With each clock cycle a data
reaches the circuit and is sent to one of the multipliers 410 to 440. The data d1 is sent to the
multiplier 410, the data d2 to the multiplier 420, the data d3 to the multiplier 430, the data d4
to the multiplier 440, the data d5 to the multiplier 410 and so on.

During a first clock cycle the coefficient c1 is sent to the multiplier 410, the
data c1*d1 is calculated and then a zero value is added thereto by the calculation device 41.
The data c1*d1 is then sent to the register 411. During a second clock cycle the coefficient c2
is sent to the multiplier 420, the data c2*d2 is calculated and then a zero value is added thereto
by the calculation device 42. The data c2*d2 is then sent to the register 421. Similar
operations are carried out for calculating the values c3*d3 and c4*d4 which are sent to the
registers 431 and 441, respectively. The data c1*d1, c2*d2, c3*d3 and c4*d4 form a first data
set.

During a fifth clock cycle the coefficient c5 is sent to the multiplier 410, the
data c5*d5 is calculated and then the data c1*d1 is added thereto by the calculation device 41.
Indeed, during the fourth clock cycle the data c1*d1, which has advanced through the
registers 411, 412 and 413 during the second, third and fourth clock cycles, is sent to the
calculation device 41. The data c1*d1 + c5*d5 calculated by the calculation device 41 is then
sent to the register 411. Similar operations are carried out during a sixth, a seventh and an
eighth clock cycle for calculating the data c2*d2 + c6*d6, c3*d3 + c7*d7 and c4*d4 + c8*d8. The
data c1*d1 + c5*d5, c2*d2 + c6*d6, c3*d3 + c7*d7 and c4*d4 + c8*d8 form a second data set
calculated on the basis of the first data set.
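
The accumulation order described for Fig. 5 can be reproduced with a short functional model. The sketch below is an assumption-laden illustration rather than a cycle-accurate description of the circuit: the name `interleaved_mac` and the `lanes` parameter are hypothetical, and the three registers of each adder are abstracted into the fact that a partial sum is reused four cycles after it was produced.

```python
def interleaved_mac(coefficients, data, lanes=4):
    """Accumulate c[n] * d[n] into lane n mod `lanes`, one product per cycle.

    Lane 0 corresponds to MAC1 (adder 41), lane 1 to MAC2 (adder 42), etc.
    """
    if len(coefficients) != len(data):
        raise ValueError("coefficients and data must have the same length")
    partial = [0] * lanes          # a zero value is added to the first products
    for cycle, (c, d) in enumerate(zip(coefficients, data)):
        lane = cycle % lanes       # d1 -> lane 0, d2 -> lane 1, ..., d5 -> lane 0
        partial[lane] += c * d     # previous partial sum returns after `lanes` cycles
    return partial
```

Calling the function with sixteen coefficients and sixteen data returns the four values MAC1 to MAC4 in order.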

Fig. 6 illustrates a communication network comprising a circuit in accordance
with the invention. Such a network comprises an encoding device ENC, a transmission
channel CHAN and a decoding circuit DEC. At the level of the encoding device ENC, a data
vector S1 to be transmitted is coded by a first systematic recursive coder 61, to produce a first
parity vector P1. In parallel therewith, the data of the data vector S1 are interleaved by a first
interleaver 62 and the vector resulting therefrom is coded by a second systematic recursive
coder 63 to produce a second parity vector P2.

The interleaving of the data of a vector consists of permuting the components of this vector in
a predefined order so as to obtain another vector. To simplify the description, the expressions
"interleaving of the data of a vector" and "interleaving of the vector" are used interchangeably
in the following.
Subsequently, the data vector S1, the first parity vector P1 and the second
parity vector P2 are sent over the transmission channel CHAN to a receiver (not shown in
Fig. 6). This is done by a transmitter (not shown in Fig. 6). The data vector S1, the first parity
vector P1 and the second parity vector P2 are then sent to the decoding circuit DEC.

The decoding circuit DEC comprises a first decoder 64, a second decoder 66, a
second interleaver 65, a third interleaver 67 and a de-interleaver 68. In the example of Fig. 6
the decoders 64 and 66 are soft-input-soft-output (SISO) decoders.

This decoding circuit DEC operates in an iterative manner. During an iteration
the first decoder 64 calculates a first extrinsic output data vector based on the data vector S1
received, the first parity vector P1 received and an extrinsic data vector coming from the
second decoder 66. If there is not yet an extrinsic data vector coming from the second
decoder 66, it is replaced by a predefined vector, for example a unit vector. This may occur
during the first iteration of a decoding.

The first extrinsic output data vector is interleaved by the second
interleaver 65 and the vector resulting therefrom is sent to the second decoder 66. The second
decoder 66 then calculates a second extrinsic output data vector based on the second parity
vector P2, on a vector S2 coming from the third interleaver 67 which has for its input the data
vector S1, and on the vector coming from the second interleaver 65. The second extrinsic
output data vector is then de-interleaved by the de-interleaver 68 and the vector resulting
therefrom is sent to the first decoder 64. A new iteration may then be performed.

Such a decoding circuit may be used in an electronic device, such as a third- 
generation mobile telephone. 

The interleaving of the data requires the calculation of intracolumn
permutation elements, as is described with reference to Fig. 7. Such a calculation of
intracolumn permutation elements is carried out by a system comprising a circuit according
to the invention, as is described with reference to Fig. 8.

Fig. 7 illustrates a calculation of an interleaving matrix and of an interleaved
block, carried out by an interleaver of the communication network of Fig. 6. The example
described hereinafter is applied to an interleaver according to the "3GPP TS 25.212 V3.9.0
(2002-03)" standard.

An object of such an interleaver is to permute the positions of the data
comprised in a data vector containing K bits, K being an integer between 40 and 5114. The
interleaver transforms the data vector into an interleaved data vector by means of an interleaving
scheme defined by an interleaving matrix containing R rows and C columns.

The example of Fig. 7 illustrates how the interleaving matrix is defined and
how the bits of a data vector are interleaved. In this example a data vector B comprising 25
bits is interleaved and an interleaved data vector B' is obtained. It will be noted that this
example has for an object to show in a simple manner how an interleaved data vector B' is
obtained. More particularly, this example does not correspond to the "3GPP TS 25.212
V3.9.0 (2002-03)" standard, in which the length K of a data vector is between 40 and 5114.

In this example each bit of the data vector B is identified by an identifier
between 0 and 24. The identifiers are written in a first matrix M1 row by row. Then, an
intracolumn permutation is carried out in the matrix M1 according to an intracolumn
permutation scheme, and a matrix M2 is obtained. An intercolumn permutation is then
performed in the matrix M2 according to an intercolumn permutation scheme, and a matrix
M3 is obtained. This matrix M3 is the interleaving matrix.

The identifiers of the bits of the interleaved data vector B' are then obtained by
a column-by-column reading of the identifiers of the interleaving matrix. In this example the
bit identified by the identifier «0», which is found in the first position in the data vector B,
is located at the twenty-fourth position in the interleaved data vector B'. The bit identified by
the identifier «5» in the data vector B is situated at the second position in the interleaved
data vector B', and so on.

For each value of K an interleaving scheme is defined. To this end,
an intracolumn permutation scheme and an intercolumn permutation
scheme are defined. The standard mentioned above specifies four intercolumn
permutation schemes, defined in Table 1. For example, the intercolumn
permutation scheme identified by number 1 replaces the first row of the matrix M2,
which is denoted «0», with the twentieth row of the matrix M2, which is denoted
«19», the second row with the tenth row, and so on.



Number of scheme    Intercolumn permutation scheme
1                   [19 9 14 4 0 2 5 7 12 18 10 8 13 17 3 1 16 6 15 11]
2                   [19 9 14 4 0 2 5 7 12 18 16 13 17 15 3 1 6 11 8 10]
3                   [9 8 7 6 5 4 3 2 1 0]
4                   [4 3 2 1 0]

Table 1: intercolumn permutation schemes
The number of rows of the interleaving matrix, as well as the intercolumn
permutation scheme, depends on the length K of the data vector as is described in Table 2.
This Table is stored in a memory and, knowing the length K, the interleaver determines the
number R of rows of the interleaving matrix as well as the intercolumn permutation scheme
to be used. Consequently, for interleaving a data vector that has a given length K, the
interleaver need not calculate the number of rows of the interleaving matrix nor the
intercolumn permutation scheme, because these parameters are predetermined.

Conversely, it is not possible to store the intracolumn permutation schemes for
each possible number C of columns. Indeed, the number C of columns may take any integer
value between 2 and 256. Consequently, storing the intracolumn permutation schemes for
each possible number C of columns requires too much memory capacity. Therefore, the
intracolumn permutation scheme is calculated each time a data vector possessing a new
length K is to be interleaved.



K                   Scheme number   R
40 ≤ K ≤ 159        4               5
160 ≤ K ≤ 200       3               10
201 ≤ K ≤ 480       1               20
481 ≤ K ≤ 530       3               10
531 ≤ K ≤ 2280      1               20
2281 ≤ K ≤ 2480     2               20
2481 ≤ K ≤ 3160     1               20
3161 ≤ K ≤ 3210     2               20
3211 ≤ K ≤ 5114     1               20

Table 2: intercolumn permutation schemes and R as a function of K



In order to calculate the intracolumn permutation scheme for a given length K,
the parameters described hereinafter are determined.

In the first place a prime number p is determined. This number p is the
smallest prime number such that (p+1) - K/R ≥ 0.

Then the number C of columns is determined. This number C is the smallest
integer from the set of integers {(p-1), p, (p+1)} such that K ≤ R*C.
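
The determination of R, p and C can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions: the function names are hypothetical, the ranges for R follow Table 2, and the condition on p is taken to be K ≤ R*(p+1), which is the reading under which the three candidates (p-1), p and (p+1) for C are all meaningful.

```python
def rows_for_length(K):
    """Number of rows R as a function of K, following Table 2."""
    if not 40 <= K <= 5114:
        raise ValueError("K must lie between 40 and 5114")
    if K <= 159:
        return 5
    if K <= 200 or 481 <= K <= 530:
        return 10
    return 20


def is_prime(n):
    """Trial-division primality test, sufficient for the small values used here."""
    if n < 2:
        return False
    divisor = 2
    while divisor * divisor <= n:
        if n % divisor == 0:
            return False
        divisor += 1
    return True


def matrix_parameters(K):
    """Return (R, p, C) for a data vector of length K."""
    R = rows_for_length(K)
    p = 2
    while not (is_prime(p) and R * (p + 1) >= K):   # assumed condition on p
        p += 1
    C = next(c for c in (p - 1, p, p + 1) if R * c >= K)
    return R, p, C
```

For K = 40 this gives R = 5, p = 7 and C = 8, so that the 5 x 8 matrix holds exactly the 40 bits.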

A primitive root v is then determined as a function of the prime number p, as 
is described in Table 3. 



p     v     p     v     p     v     p     v
7     3     59    2     113   3     191   19
11    2     61    2     127   3     193   5
13    2     67    2     131   2     197   2
17    3     71    7     137   3     199   3
19    2     73    5     139   2     211   2
23    5     79    3     149   2     223   3
29    2     83    2     151   6     227   2
31    3     89    3     157   5     229   6
37    2     97    5     163   2     233   3
41    6     101   2     167   5     239   7
43    3     103   5     173   2     241   7
47    5     107   2     179   2     251   6
53    2     109   6     181   2     257   3

Table 3: primitive root v as a function of the prime number p 

Subsequently, a sequence q of minimal prime integers is calculated. This
sequence is composed of R values and is constructed as follows:

- q[0] = 1
- for j > 0, q[j] is the smallest prime number such that:
  • the greatest common divisor of q[j] and (p-1) is 1
  • q[j] > 6
  • q[j] > q[j-1].

Then, a permuted sequence r of minimal prime integers is calculated by
utilizing the intercolumn permutation scheme T: r[T[j]] = q[j].
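
The construction of the sequences q and r can be sketched as follows. The function names and the trial-division helper are assumptions introduced for illustration; the scheme T is passed as a list such as those of Table 1, and its length gives R.

```python
from math import gcd


def is_prime(n):
    """Trial-division primality test, adequate for small candidates."""
    if n < 2:
        return False
    divisor = 2
    while divisor * divisor <= n:
        if n % divisor == 0:
            return False
        divisor += 1
    return True


def minimal_primes(p, R):
    """Sequence q: q[0] = 1, then increasing primes > 6 coprime with p - 1."""
    q = [1]
    candidate = 7                      # enforces q[j] > 6
    while len(q) < R:
        if is_prime(candidate) and gcd(candidate, p - 1) == 1:
            q.append(candidate)        # candidates only grow, so q[j] > q[j-1]
        candidate += 1
    return q


def permuted_primes(p, T):
    """Sequence r defined by r[T[j]] = q[j]."""
    q = minimal_primes(p, len(T))
    r = [0] * len(T)
    for j, target in enumerate(T):
        r[target] = q[j]
    return r
```

With p = 7 and scheme number 4, `permuted_primes(7, [4, 3, 2, 1, 0])` simply reverses q = [1, 7, 11, 13, 17].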

A basic sequence s is then calculated. This sequence is composed of p-1
values and is constructed as follows:

- s[0] = 1
- s[i] = (v*s[i-1]) mod p, where "mod p" indicates that the multiplication is effected
modulo-p.

Finally, an intracolumn permutation scheme is calculated for each column j.
For a given column j, C intracolumn permutation elements Uj are calculated in accordance
with the calculation mode described below, given for C = p:

- Uj[i] = s[(i*r[j]) mod (p-1)] for i = 0, 1, ..., p-2
- Uj[p-1] = 0
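
A direct evaluation of these definitions, for the case C = p, may look as follows. The function names are hypothetical and the sketch takes the value r[j] as a plain argument `r_j`; it is meant only to make the indexing explicit.

```python
def base_sequence(p, v):
    """Basic sequence s of p - 1 values: s[0] = 1, s[i] = (v * s[i-1]) mod p."""
    s = [1]
    for _ in range(p - 2):
        s.append((v * s[-1]) % p)
    return s


def intracolumn_elements(p, v, r_j):
    """Elements Uj[0] .. Uj[p-1] for one column j, in the case C = p."""
    s = base_sequence(p, v)
    u = [s[(i * r_j) % (p - 1)] for i in range(p - 1)]
    u.append(0)                        # Uj[p-1] = 0
    return u
```

For p = 7 and v = 3 (Table 3), the basic sequence is [1, 3, 2, 6, 4, 5], and with r[j] = 11 the elements form a permutation of the values 0 to 6.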

It may be demonstrated that the expression Uj[i] = s[(i*r[j]) mod (p-1)] is equal
to:

Uj[i+1] = (v'[j]*Uj[i]) mod p, where v'[j] is a new primitive root equal to v^r[j].

Indeed:

- The expression s[i] = (v*s[i-1]) mod p is equal to the expression:
s[i] = (v^i * s[0]) mod p = v^i mod p.

- Consequently, the expression Uj[i] = s[(i*r[j]) mod (p-1)] is equal to the expression:
Uj[i] = v^((i*r[j]) mod (p-1)) mod p.

- If one writes a = v and i*r[j] = b:

- a^b mod p = [a^(n(p-1))][a^(b mod (p-1))] mod p, where n is such that
b = n(p-1) + b mod (p-1).

- thus a^b mod p = [a^(n(p-1)) mod p][a^(b mod (p-1))] mod p
                 = [(a^(p-1))^n mod p][a^(b mod (p-1))] mod p
                 = [a^(p-1) mod p]^n [a^(b mod (p-1))] mod p.

- If p is a prime number and if the greatest common divisor between a and p is 1, then
a^(p-1) mod p = 1. In this example a = v and v is never equal to p, which implies that the
greatest common divisor between a and p is 1. Thus
[a^(p-1) mod p]^n = 1. Consequently, a^b mod p = a^(b mod (p-1)) mod p.

- If a is replaced by v and b by i*r[j] in this expression, one obtains:
v^(i*r[j]) mod p = v^((i*r[j]) mod (p-1)) mod p = Uj[i].

- This expression is equal to the expression: Uj[i] = (v'[j])^i mod p, where
v'[j] = v^r[j].

- By applying this expression in a recursive fashion, one obtains:
Uj[i+1] = (v'[j]*Uj[i]) mod p.
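
The equivalence can also be checked numerically. The sketch below compares the recursive form with the closed form v^((i*r[j]) mod (p-1)) mod p; the function names are assumptions, and Python's built-in three-argument `pow` is used for modular exponentiation.

```python
def elements_recursive(p, v, r_j):
    """Uj[0] .. Uj[p-2] obtained by repeated multiplication by v'[j] modulo p."""
    v_new = pow(v, r_j, p)             # new primitive root v'[j] = v^r[j] mod p
    u = [1]                            # Uj[0] = s[0] = 1
    for _ in range(p - 2):
        u.append((v_new * u[-1]) % p)
    return u


def elements_closed_form(p, v, r_j):
    """Uj[i] = v^((i * r[j]) mod (p - 1)) mod p for i = 0 .. p-2."""
    return [pow(v, (i * r_j) % (p - 1), p) for i in range(p - 1)]
```

For p = 7, v = 3 and r[j] = 11, both functions return [1, 5, 4, 6, 2, 3]. The recursive form needs only one modulo-p multiplication per element, which is the operation that a single multiplier can compute for all columns in turn.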

Fig. 8 illustrates a system comprising a circuit according to the invention for
calculating the intracolumn permutation elements described above.

Such a system comprises a calculation device 800 and transport means 801.
The calculation device comprises fifteen registers R1 to R15, seven modulo-p shift elements
SMP1 to SMP7, eight multiplexers MUX1 to MUX8 and seven modulo-p adders AMP2 to
AMP8. The transport means 801 comprise twelve registers R16 to R27. The system further
comprises regulation means in the form of a multiplexer MUX9.

The calculation device 800 makes it possible to perform a modulo-p multiplication
between two data x and y which are smaller than p. Let us suppose that x and y are written in
binary language in eight bits from the least significant to the most significant bit:

x = x(0) x(1) x(2) x(3) x(4) x(5) x(6) x(7)

y = y(0) y(1) y(2) y(3) y(4) y(5) y(6) y(7)

During a stage 81 the data x is sent to the modulo-p shift element SMP1. If the
bit y(0) has the value 1, the value x is copied in the register R8 by means of the multiplexer
MUX1. If the bit y(0) has the value 0, the value 0 is copied in the register R8.
The modulo-p shift element shifts the data x to the left and compares the data
obtained with p. This data obtained is written as:

x(1) x(2) x(3) x(4) x(5) x(6) x(7) 0

If this data obtained is larger than p, a modulo-p operation is carried out with
this obtained data and the result of this operation is written in the register R1. If the data
obtained is smaller than p it is copied in the register R1.

During a stage 82 the data stored in the register R1 is sent to the modulo-p
shift element SMP2 and the multiplexer MUX2. Each stage requires a clock cycle for
activating the registers. If the second bit y(1) has the value 1, the data stored in the register
R1 is sent to the modulo-p adder AMP2. If the second bit y(1) has the value 0, the value 0 is
sent to the modulo-p adder AMP2. The data stored in the register R8 is also sent to the
modulo-p adder AMP2. The modulo-p adder AMP2 performs a modulo-p addition of its two
input values and sends the result to the register R9.

Similar operations are carried out during the stages 83 to 88 and the result of
the modulo-p multiplication between x and y is obtained at the output of the modulo-p adder
AMP8.
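
The stage-by-stage operation described above corresponds to a shift-and-add multiplication in which every intermediate value is kept below p. The following Python sketch models it sequentially; the function names and the `bits` parameter are assumptions, and the eight hardware stages are represented by the iterations of a loop rather than by pipelined registers.

```python
def mod_shift(x, p):
    """Shift x left by one bit and reduce modulo p (x is assumed smaller than p)."""
    doubled = x << 1
    return doubled - p if doubled >= p else doubled


def mod_add(a, b, p):
    """Add two values smaller than p and reduce the sum modulo p."""
    total = a + b
    return total - p if total >= p else total


def mod_mul(x, y, p, bits=8):
    """Compute (x * y) mod p by examining the bits of y from least significant."""
    accumulator = 0                                  # first stage: x or 0 is loaded
    for stage in range(bits):
        if (y >> stage) & 1:                         # multiplexer selects x or 0
            accumulator = mod_add(accumulator, x, p)
        x = mod_shift(x, p)                          # prepare 2x mod p for next stage
    return accumulator
```

Because both helpers receive operands smaller than p, a single conditional subtraction is enough to perform each reduction.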

The calculation of intracolumn permutation elements by the circuit of Fig. 8 is 
described hereinafter. 

The new primitive roots v'[j] and the intracolumn permutation elements are
written in eight bits if the number of rows R of the interleaving matrix has the value 10 or 20
and in five bits if R has the value 5.

Let us suppose that the new primitive roots v'[j] and the intracolumn
permutation elements are written in eight bits. In that case a modulo-p multiplication between
a new primitive root and an intracolumn permutation element requires 8 clock cycles.

To calculate the intracolumn permutation element U0[1], the intracolumn
permutation element U0[0] is sent to the modulo-p shift element SMP1 and to the multiplexer
MUX1 during stage 81. After a first clock cycle the stage 82 is carried out during a second
clock cycle. During this second clock cycle the intracolumn permutation element U1[0] is
sent to the modulo-p shift element SMP1 and to the multiplexer MUX1 in order to carry out the first
modulo-p multiplication stage between v'[1] and U1[0], whereas the second stage of the
modulo-p multiplication between v'[0] and U0[0] is carried out.

Fig. 8 illustrates the calculations carried out during an eighth clock cycle. The
eighth stage of the modulo-p multiplication between v'[0] and U0[0] is carried out, in which
the multiplexer MUX8 verifies whether the eighth bit v'[0](7) of the new primitive root v'[0]
has the value 1. The seventh stage of the modulo-p multiplication between v'[1] and U1[0] is
carried out, in which the multiplexer MUX7 verifies whether the seventh bit v'[1](6) of the
new primitive root v'[1] has the value 1, and so on. The first stage of the modulo-p
multiplication between v'[7] and U7[0] is carried out, in which the multiplexer MUX1 verifies
whether the first bit v'[7](0) of the new primitive root v'[7] has the value 1.

At the end of the eighth clock cycle the intracolumn permutation element
U0[1] is calculated and stored in the register R15. Let us suppose that the interleaving matrix
has 20 rows. For each column twenty intracolumn permutation elements are to be calculated.
The intracolumn permutation elements U0[1] to U19[1] are thus calculated, then the element
U0[2] is calculated based on U0[1], the element U1[2] based on U1[1] and so on.
Consequently, each intracolumn permutation element calculated by the calculation device
800 is used again by this calculation device 800 twelve clock cycles after having been
calculated. The transport means 801, which comprise twelve registers R16 to R27, make it possible to
move one data from the output to the input of the calculation device 800 in twelve clock
cycles.

Let us suppose that the interleaving matrix has 10 rows. For each column j ten
intracolumn permutation elements are to be calculated. Consequently, each intracolumn
permutation element calculated by the calculation device 800 is used again by this calculation
device 800 two clock cycles after having been calculated. Thanks to the multiplexer MUX9 it
is possible to select the data on the output of the register R17 in order to transport them from
the output to the input of the calculation device 800 in two clock cycles.

The verb "to comprise" and its conjugations are to be interpreted in a broad
way, that is to say, as not excluding the presence of not only other elements than those listed
after said verb, but also a plurality of elements already mentioned after said verb and
preceded by the word "a" or "an".



